laplacian regularization
- Europe > France > Île-de-France > Paris > Paris (0.04)
- North America > United States > Texas (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning
As annotations of data can be scarce in large-scale practical problems, leveraging unlabelled examples is one of the most important aspects of machine learning. This is the aim of semi-supervised learning. To benefit from the access to unlabelled data, it is natural to diffuse smoothly knowledge of labelled data to unlabelled one. This induces to the use of Laplacian regularization. Yet, current implementations of Laplacian regularization suffer from several drawbacks, notably the well-known curse of dimensionality.
- Europe > France > Île-de-France > Paris > Paris (0.04)
- North America > United States > Texas (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning
As annotations of data can be scarce in large-scale practical problems, leveraging unlabelled examples is one of the most important aspects of machine learning. This is the aim of semi-supervised learning. To benefit from the access to unlabelled data, it is natural to diffuse smoothly knowledge of labelled data to unlabelled one. This induces to the use of Laplacian regularization. Yet, current implementations of Laplacian regularization suffer from several drawbacks, notably the well-known curse of dimensionality.
- Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)
Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning
As annotations of data can be scarce in large-scale practical problems, leveraging unlabelled examples is one of the most important aspects of machine learning. This is the aim of semi-supervised learning. To benefit from the access to unlabelled data, it is natural to diffuse smoothly knowledge of labelled data to unlabelled one. This induces to the use of Laplacian regularization. Yet, current implementations of Laplacian regularization suffer from several drawbacks, notably the well-known curse of dimensionality.
- Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.50)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
Graph-Based Methods for Discrete Choice
Tomlinson, Kiran, Benson, Austin R.
Choices made by individuals have widespread impacts--for instance, people choose between political candidates to vote for, between social media posts to share, and between brands to purchase--moreover, data on these choices are increasingly abundant. Discrete choice models are a key tool for learning individual preferences from such data. Additionally, social factors like conformity and contagion influence individual choice. Traditional methods for incorporating these factors into choice models do not account for the entire social network and require hand-crafted features. To overcome these limitations, we use graph learning to study choice in networked contexts. We identify three ways in which graph learning techniques can be used for discrete choice: learning chooser representations, regularizing choice model parameters, and directly constructing predictions from a network. We design methods in each category and test them on real-world choice datasets, including county-level 2016 US election results and Android app installation and usage data. We show that incorporating social network structure can improve the predictions of the standard econometric choice model, the multinomial logit. We provide evidence that app installations are influenced by social context, but we find no such effect on app usage among the same participants, which instead is habit-driven. In the election data, we highlight the additional insights a discrete choice framework provides over classification or regression, the typical approaches. On synthetic data, we demonstrate the sample complexity benefit of using social information in choice models.
- North America > United States > California (0.14)
- North America > United States > South Dakota > Oglala Lakota County (0.14)
- North America > United States > Utah (0.04)
- (9 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Information Technology (1.00)
- Government > Voting & Elections (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Alternating minimization algorithms for graph regularized tensor completion
Guan, Yu, Dong, Shuyu, Gao, Bin, Absil, P. -A., Glineur, François
We consider a Canonical Polyadic (CP) decomposition approach to low-rank tensor completion (LRTC) by incorporating external pairwise similarity relations through graph Laplacian regularization on the CP factor matrices. The usage of graph regularization entails benefits in the learning accuracy of LRTC, but at the same time, induces coupling graph Laplacian terms that hinder the optimization of the tensor completion model. In order to solve graph-regularized LRTC, we propose efficient alternating minimization algorithms by leveraging the block structure of the underlying CP decomposition-based model. For the subproblems of alternating minimization, a linear conjugate gradient subroutine is specifically adapted to graph-regularized LRTC. Alternatively, we circumvent the complicating coupling effects of graph Laplacian terms by using an alternating directions method of multipliers. Based on the Kurdyka-{\L}ojasiewicz property, we show that the sequence generated by the proposed algorithms globally converges to a critical point of the objective function. Moreover, the complexity and convergence rate are also derived. In addition, numerical experiments including synthetic data and real data show that the graph regularized tensor completion model has improved recovery results compared to those without graph regularization, and that the proposed algorithms achieve gains in time efficiency over existing algorithms.
K-Nearest-Neighbors Induced Topological PCA for scRNA Sequence Data Analysis
Cottrell, Sean, Hozumi, Yuta, Wei, Guo-Wei
Single-cell RNA sequencing (scRNA-seq) is widely used to reveal heterogeneity in cells, which has given us insights into cell-cell communication, cell differentiation, and differential gene expression. However, analyzing scRNA-seq data is a challenge due to sparsity and the large number of genes involved. Therefore, dimensionality reduction and feature selection are important for removing spurious signals and enhancing downstream analysis. Traditional PCA, a main workhorse in dimensionality reduction, lacks the ability to capture geometrical structure information embedded in the data, and previous graph Laplacian regularizations are limited by the analysis of only a single scale. We propose a topological Principal Components Analysis (tPCA) method by the combination of persistent Laplacian (PL) technique and L$_{2,1}$ norm regularization to address multiscale and multiclass heterogeneity issues in data. We further introduce a k-Nearest-Neighbor (kNN) persistent Laplacian technique to improve the robustness of our persistent Laplacian method. The proposed kNN-PL is a new algebraic topology technique which addresses the many limitations of the traditional persistent homology. Rather than inducing filtration via the varying of a distance threshold, we introduced kNN-tPCA, where filtrations are achieved by varying the number of neighbors in a kNN network at each step, and find that this framework has significant implications for hyper-parameter tuning. We validate the efficacy of our proposed tPCA and kNN-tPCA methods on 11 diverse benchmark scRNA-seq datasets, and showcase that our methods outperform other unsupervised PCA enhancements from the literature, as well as popular Uniform Manifold Approximation (UMAP), t-Distributed Stochastic Neighbor Embedding (tSNE), and Projection Non-Negative Matrix Factorization (NMF) by significant margins.
- North America > United States > Michigan > Ingham County > Lansing (0.04)
- North America > United States > Michigan > Ingham County > East Lansing (0.04)
- North America > United States > Missouri > Greene County > Springfield (0.04)
- Workflow (1.00)
- Research Report (1.00)
Learning on Graph with Laplacian Regularization
We consider a general form of transductive learning on graphs with Laplacian regularization, and derive margin-based generalization bounds using appropriate geometric properties of the graph. We use this analysis to obtain a better understanding of the role of normalization of the graph Laplacian matrix as well as the effect of dimension reduction. The results suggest a limitation of the standard degree-based normalization. We propose a remedy from our analysis and demonstrate empirically that the remedy leads to improved classification performance.